Question 3


Intent of Question 





The primary goals of this question were to assess students’ ability to (1) compare two distributions
presented with histograms and (2) comment on the appropriateness of using a two-sample t-procedure in a
given setting.


Solution 
Part (a): 


Household size tended to be larger in 1950 than in 2000. The histograms reveal a much larger proportion of 
small (1-, 2-, and 3-person) households in 2000 than in 1950. Similarly, the histograms reveal a much 
smaller proportion of large (5-person and larger) households in 2000 than in 1950. Also, the median 
household sizes can be calculated to be 5 people per household in 1950 compared with 3 or 4 people per 
household in 2000. The year 1950 displayed slightly more variability in household sizes than the year 2000. 
Although the interquartile ranges for both years are the same (3 people), the standard deviation (1950: 
about 2.6 people; 2000: about 2.1 people) and the range (1950: 13 people; 2000: 11 people) are larger for 
1950 than for 2000. Both distributions of household size are skewed to the right. In both years, there are a 
few households with very large families, as large as 14 people in 1950 and 12 people in 2000. 
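The summary statistics cited in this solution (median, range, standard deviation) can be computed from the bar heights of a histogram. The following is a minimal sketch in Python, assuming the heights have been read off as a table of household size and frequency. The function name and the toy counts are hypothetical and are not the exam's 1950 or 2000 data.

```python
import statistics


def summarize(counts):
    """Return median, range, and sample standard deviation for a frequency table."""
    # Expand {household size: number of households} into one value per household.
    values = [size for size, freq in sorted(counts.items()) for _ in range(freq)]
    return {
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),
    }


# Hypothetical toy counts (NOT the 1950 or 2000 data from the question).
toy_counts = {1: 2, 2: 3, 3: 1}
summary = summarize(toy_counts)
print(summary)
```

Running the same function on the counts for each year would give the values needed to compare center and variability side by side.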


Part (b): 


The conditions for applying a two-sample t-procedure are: 
1. The data come from independent random samples or from random assignment to two groups; 
2. The populations are normally distributed, or both sample sizes are large; 
3. The population sizes are at least 10 (or 20) times the sample sizes. 


The first condition is satisfied because independent random samples were selected for the years 1950 
and 2000. The second condition is satisfied because the sample sizes (500 in each group) are quite 
large, despite the right skewness of the distributions of household sizes in the sample data. The third 
condition is satisfied because the number of households in the large metropolitan area in both 1950 
and 2000 would easily exceed 10 × 500 = 5,000.
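The three checks above can be sketched as a short Python function. This is an illustration only: the function name, the threshold of 30 for a "large" sample, and the population figure of 1,000,000 households are assumptions and do not come from the question.

```python
def check_two_sample_t_conditions(n1, n2, pop1, pop2, random_samples,
                                  large_n=30, multiple=10):
    """Return a dict saying whether each of the three conditions is met."""
    return {
        # Condition 1: a fact about the study design, supplied by the caller.
        "independent random samples": random_samples,
        # Condition 2: large samples stand in for normal populations.
        # large_n=30 is an assumed rule of thumb, not a value from the source.
        "large samples": n1 >= large_n and n2 >= large_n,
        # Condition 3: each population is at least `multiple` times its sample.
        "population size": pop1 >= multiple * n1 and pop2 >= multiple * n2,
    }


# Sample sizes of 500 come from the question; the population sizes are hypothetical.
checks = check_two_sample_t_conditions(500, 500, 1_000_000, 1_000_000,
                                       random_samples=True)
print(checks)
```

With 500 households per sample, the third condition fails only if a population has fewer than 5,000 households, which matches the arithmetic in the solution.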


Scoring 


This question is scored in four sections. Part (a) has three components: (1) comparing the centers of the 
two distributions; (2) comparing variability for the two distributions; (3) identifying the shapes of both 
distributions and including context related to the variable of interest. Section 1 consists of part (a), 
component 1; section 2 consists of part (a), component 2; section 3 consists of part (a), component 3. 
Section 4 consists of part (b). Sections 1 and 2 are scored as essentially correct (E) or incorrect (I). 
Sections 3 and 4 are scored as essentially correct (E), partially correct (P), or incorrect (I). 


Section 1 is scored as follows: 
Essentially correct (E) if the response correctly compares center (or location) for both distributions.


Incorrect (I) otherwise. 
Section 2 is scored as follows: 
Essentially correct (E) if the response correctly compares variability for both distributions. 
Incorrect (I) otherwise. 
Section 3 is scored as follows: 


Essentially correct (E) if the response includes context related to the variable of interest (household
size) AND the response correctly identifies the shapes of both distributions. 


Partially correct (P) if the response correctly identifies the shapes of both distributions BUT does NOT 
include context related to the variable of interest (household size), 

OR 
if the response correctly identifies the shape of only one distribution AND includes context related to 
the variable of interest (household size). 


Incorrect (I) otherwise. 
Section 4 is scored as follows: 
Essentially correct (E) if the response correctly states and checks the following two conditions:
1. The data come from independent random samples.
2. The normality/sample size condition.
Partially correct (P) if the response correctly states and checks only one of the two conditions listed 
above, 
OR 
if the response correctly refers to random samples and large sample size, BUT does NOT state and
check either condition correctly. 


Incorrect (I) otherwise. 


Note: The population size condition does not need to be checked to earn E or P. 


Each essentially correct (E) section counts as 1 point. Each partially correct (P) section counts as ½ point.


4 Complete Response
3 Substantial Response
2 Developing Response
1 Minimal Response


If a response is between two scores (for example, 2½ points), use a holistic approach to decide whether to
score up or down, depending on the overall strength of the response and communication. 
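The point arithmetic can be sketched in Python as follows. The names are hypothetical, and the grade patterns are those described in the commentary on samples 3A, 3B, and 3C. The sketch only adds up points. It does not attempt the holistic judgment, so a half-point total is reported as needing a reader's decision.

```python
# Points per section grade, as stated in the scoring rules.
POINTS = {"E": 1.0, "P": 0.5, "I": 0.0}


def section_points(grades):
    """Add up the points for the four section grades (each 'E', 'P', or 'I')."""
    return sum(POINTS[grade] for grade in grades)


def needs_holistic_decision(total):
    """A total that falls between two whole scores is left to the reader."""
    return total != int(total)


# Section grades (sections 1 to 4) as described in the sample commentary.
samples = {
    "3A": ["E", "E", "E", "E"],
    "3B": ["E", "E", "P", "P"],
    "3C": ["I", "I", "E", "E"],
}
totals = {name: section_points(grades) for name, grades in samples.items()}
print(totals)
```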
(a) Compare the distributions of household size in the metropolitan area for the years 1950 and 2000.
(b) A researcher wants to use these data to construct a confidence interval to estimate the change in mean
household size in the metropolitan area from the year 1950 to the year 2000. State the conditions for using a 
two-sample t-procedure, and explain whether the conditions for inference are met. 
Question 3
Overview 


The primary goals of this question were to assess students’ ability to (1) compare two distributions
presented with histograms and (2) comment on the appropriateness of using a two-sample t-procedure in a
given setting.


Sample: 3A 
Score: 4 


In part (a) the student effectively compares the center and variability of the two distributions, with 
appropriate numerical support. As a result, sections 1 and 2 were scored as essentially correct. The student 
also provides a thorough, accurate description of the shapes of the two distributions, with clear reference to 
household size. Consequently, section 3 was scored as essentially correct. In part (b) the student eloquently 
states and correctly verifies the random samples and normality/large sample size conditions, and section 4 
was scored as essentially correct. With all four sections scored as essentially correct, the response earned a 
score of 4. 


Sample: 3B 
Score: 3 


In part (a) the student begins by correctly identifying the shapes of the two distributions. However, the 
student never mentions the variable of interest (household size). As a result, section 3 was scored as partially 
correct. The student provides an accurate comparison of both center and variability based on the two 
histograms, so sections 1 and 2 were each scored as essentially correct. In part (b) the student correctly states 
and checks the random samples condition. However, the student does not address the normality/large 
sample size condition correctly, suggesting that the condition is not met in spite of the large sample sizes. 
Consequently, section 4 was scored as partially correct. With two sections scored as essentially correct and 
two sections scored as partially correct, the response earned a score of 3. 


Sample: 3C 
Score: 2 


In part (a) the student separately describes the household size distributions in 1950 and 2000 but never 
compares their centers and spreads. As a result, sections 1 and 2 were each scored as incorrect. However, 
the student does correctly describe the shapes of the distributions and includes the context (household size), 
so section 3 was scored as essentially correct. In part (b) the student accurately states and checks both the 
independent random samples and normality/large sample size conditions. Consequently, section 4 was 
scored as essentially correct. With two sections scored as essentially correct and two sections scored as 
incorrect, the response earned a score of 2. 